Melanoma Recognition by Fusing Convolutional Blocks and Dynamic Routing between Capsules
Skin cancer is one of the most common types of cancer in the world, with melanoma being the most lethal form. Automatic melanoma diagnosis from skin images has recently gained attention within the machine learning community because of the complexity of the task. In the past few years, convolutional neural network models have commonly been used to approach this problem. This type of model, however, has disadvantages that sometimes hamper its application in real-world situations, e.g., the difficulty of constructing transformation-invariant models and the inability to consider spatial hierarchies between entities within an image. Recently, the Dynamic Routing between Capsules architecture (CapsNet) has been proposed to overcome such limitations. This work proposes a new architecture that combines convolutional blocks with a customized CapsNet architecture, allowing for the extraction of richer abstract features. The architecture uses high-quality 299×299×3 skin lesion images, and the main parameters are tuned to ensure effective learning under limited training data. An extensive experimental study on eleven image datasets was conducted, in which the proposal significantly outperformed several state-of-the-art models. Finally, predictions made by the model were validated by applying two modern model-agnostic interpretation tools.
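As a point of reference for the capsule component, the "squashing" non-linearity from the original Dynamic Routing between Capsules formulation can be sketched as below. This is a minimal illustration of a generic CapsNet building block, not the customized architecture described in the abstract; the function name `squash` and the `eps` parameter are assumptions made for this example.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction.

    Implements v = (|s|^2 / (1 + |s|^2)) * (s / |s|); eps (an assumed
    safeguard) avoids division by zero for the all-zero vector.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


# Long vectors end up with length close to 1, short ones close to 0.
long_capsule = squash([3.0, 4.0])
short_capsule = squash([0.03, 0.04])
```

The output length can then be read as the probability that the entity represented by the capsule is present, which is the role it plays in routing.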
Speeding Up Evolutionary Learning Algorithms using GPUs
This paper proposes a multithreaded Genetic Programming classification evaluation model using NVIDIA CUDA GPUs to reduce the computational time caused by poor performance on large problems. Two different classification algorithms are benchmarked using UCI Machine Learning data sets. Experimental results compare the performance of single-threaded and multithreaded Java, C, and GPU code and show the far better efficiency obtained by our proposal.
Speeding up Multiple Instance Learning Classification Rules on GPUs
Multiple instance learning is a challenging task in supervised learning and data mining. However, algorithms become slow when learning from large-scale and high-dimensional data sets. Graphics processing units (GPUs) are being used to reduce the computing time of algorithms. This paper presents an implementation of the G3P-MI algorithm on GPUs for solving multiple instance problems using classification rules. The proposed GPU model can be distributed over multiple GPUs so that it scales across large-scale and high-dimensional data sets. The proposal is compared to the multi-threaded CPU algorithm with SSE parallelism over a series of data sets. Experimental results show that the computation time can be significantly reduced and scalability improved. Specifically, a speedup of up to 149× can be achieved over the multi-threaded CPU algorithm when using four GPUs, and the rule interpreter runs over 108 billion Genetic Programming operations per second.
Parallel evaluation of Pittsburgh rule-based classifiers on GPUs
Individuals in Pittsburgh rule-based classifiers represent a complete solution to the classification problem, and each individual is a variable-length set of rules. Therefore, these systems usually demand a high level of computational resources and run-time, which increase with the complexity and size of the data sets. This computational cost is known to be mainly due to the recurring evaluation of the rules and of the individuals as rule sets. In this paper we propose a parallel evaluation model of rules and rule sets on GPUs, based on the NVIDIA CUDA programming model, which significantly reduces the run-time and speeds up the algorithm. The results of the experimental study support the great efficiency and high performance of the GPU model, which is scalable to multiple GPU devices. The GPU model achieves a rule interpreter performance of up to 64 billion operations per second, and the evaluation of the individuals is sped up by up to 3.461× compared to the CPU model. This gives the GPU model a significant advantage, especially for addressing large and complex problems within reasonable time, where the CPU run-time is not acceptable.
Multi-Objective Genetic Programming for Feature Extraction and Data Visualization
Feature extraction transforms high-dimensional data into a new subspace of lower dimensionality while preserving classification accuracy. Traditional algorithms do not consider the multi-objective nature of this task. Data transformations should improve the classification performance on the new subspace, as well as facilitate data visualization, which has attracted increasing attention in recent years. Moreover, new challenges arising in data mining, such as the need to deal with imbalanced data sets, call for new algorithms capable of handling this type of data. This paper presents a Pareto-based multi-objective genetic programming algorithm for feature extraction and data visualization. The algorithm is designed to obtain data transformations that optimize classification and visualization performance on both balanced and imbalanced data. Six classification and visualization measures are identified as objectives to be optimized by the multi-objective algorithm. The algorithm is evaluated and compared to 11 well-known feature extraction methods, and to the performance on the original high-dimensional data. Experimental results on 22 balanced and 20 imbalanced data sets show that it performs very well on both types of data, which is its significant advantage over existing feature extraction algorithms.
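To illustrate what "Pareto-based" means when several measures are optimized at once, the following minimal sketch filters a set of candidate solutions down to its non-dominated front. It is a generic illustration, not the paper's algorithm: the helper names `dominates` and `pareto_front`, the assumption that every objective is maximized, and the two-objective score tuples are all made up for this example.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives assumed to be maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical (classification score, visualization score) pairs.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
front = pareto_front(candidates)
```

Here `(0.4, 0.4)` is dropped because `(0.5, 0.5)` is better on both objectives, while the remaining three candidates represent different trade-offs and all stay on the front.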
Improving the understanding of cancer in a descriptive way: An emerging pattern mining-based approach
This paper presents an approach based on emerging pattern mining to analyse cancer through genomic data. Unlike existing approaches, which focus mainly on prediction, the proposal aims to improve the understanding of cancer descriptively, without requiring any prior knowledge or hypothesis to be validated. Additionally, it allows high-order relationships to be considered, so that not only essential genes related to the disease are taken into account, but also the combined effect of various secondary genes that can influence different pathways directly or indirectly related to the disease. The prime hypothesis is that splitting genomic cancer data into two subsets, that is, cases and controls, will allow us to determine which genes, and their expressions, are associated with different cancer types. The possibilities of the proposal are demonstrated by analyzing RNA-Seq data for six different types of cancer: breast, colon, lung, thyroid, prostate, and kidney. Some of the extracted insights were already described in the related literature as good cancer biomarkers, while others have not been described yet, mainly because existing techniques are biased by prior knowledge provided by biological databases.
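The case/control split can be made concrete with the usual emerging-pattern measure, the growth rate: the ratio of a pattern's support in one subset to its support in the other. The sketch below is a minimal illustration under stated assumptions, not the paper's method: each sample is modelled as a set of discretized items, and the gene labels (`G1:high`, and so on) and function names are hypothetical.

```python
import math


def support(pattern, samples):
    """Fraction of samples that contain every item of the pattern."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if pattern <= s) / len(samples)


def growth_rate(pattern, cases, controls):
    """Support in cases divided by support in controls.

    Returns infinity for patterns that occur only in cases and 0.0 for
    patterns that occur in neither subset.
    """
    s_cases = support(pattern, cases)
    s_controls = support(pattern, controls)
    if s_controls == 0:
        return math.inf if s_cases > 0 else 0.0
    return s_cases / s_controls


# Hypothetical discretized expression data.
cases = [{"G1:high", "G2:low"}, {"G1:high"}, {"G2:low"}, {"G1:high", "G2:low"}]
controls = [{"G1:high"}, {"G2:low"}, {"G3:high"}, {"G2:low"}]

single = growth_rate({"G1:high"}, cases, controls)
combined = growth_rate({"G1:high", "G2:low"}, cases, controls)
```

In this toy data the combined pattern never appears among the controls, so its growth rate is infinite even though neither item is exclusive to the cases on its own, which is the kind of high-order relationship the abstract refers to.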
Improving spiking neural network performance with auxiliary learning
The backpropagation through time learning rule has enabled the supervised training of deep spiking neural networks to process temporal neuromorphic data. However, their performance is still below that of non-spiking neural networks. Previous work pointed out that one of the main causes is the limited amount of neuromorphic data currently available, which is also difficult to generate. To overcome this problem, we explore the use of auxiliary learning as a means of helping spiking neural networks identify more general features. Tests are performed on the neuromorphic DVS-CIFAR10 and DVS128-Gesture datasets. The results indicate that training with auxiliary learning tasks improves accuracy, albeit slightly. Different scenarios, including manual and automatic combination of losses using implicit differentiation, are explored to analyze the use of auxiliary tasks.
A System for Checking Grades via Web Pages and Mobile Telephony
This paper presents a system that lets any student at the University of Córdoba check the final grades obtained in the subjects in which they are enrolled. The system allows grades to be consulted over the Internet and via mobile telephony. Teachers registered in the system can upload grades without additional software; they only need the grade-sheet file generated by the UcoActas application. Once the grades are uploaded, all students enrolled in the subject and subscribed to the service receive an e-mail notifying them that the grades are available. They can then consult them with a Web browser or with a dedicated mobile application, which makes a secure request to the system. Functionality tests are currently being carried out with volunteer students from several courses.
Scalable CAIM Discretization on Multiple GPUs Using Concurrent Kernels
CAIM (Class-Attribute Interdependence Maximization) is one of the state-of-the-art algorithms for discretizing data for which classes are known. However, it may take a long time when run on high-dimensional, large-scale data with a large number of attributes and/or instances. This paper presents a solution to this problem by introducing a GPU-based implementation of the CAIM algorithm that significantly speeds up the discretization process on big, complex data sets. The GPU-based implementation is scalable to multiple GPU devices and enables the use of the concurrent kernel execution capabilities of modern GPUs. The CAIM GPU-based model is evaluated and compared with the original CAIM using single and multi-threaded parallel configurations on 40 data sets with different characteristics. The results show great speedup, up to 139 times faster using 4 GPUs, which makes discretization of big data efficient and manageable. For example, the discretization time of one big data set is reduced from 2 hours to less than 2 minutes.
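For readers unfamiliar with CAIM, the criterion it maximizes can be sketched on the CPU as follows. This is a minimal illustration of the standard CAIM criterion, computed from a quanta matrix (class counts per interval), and is not the GPU implementation described above; the function name `caim` and the example matrices are assumptions.

```python
def caim(quanta):
    """CAIM criterion for a quanta matrix.

    quanta[i][r] is the number of instances of class i that fall into
    interval r. The value is the mean over intervals of
    (largest class count in the interval)^2 / (instances in the interval).
    """
    n_intervals = len(quanta[0])
    total = 0.0
    for r in range(n_intervals):
        column = [row[r] for row in quanta]
        interval_total = sum(column)
        if interval_total > 0:  # empty intervals contribute nothing
            total += max(column) ** 2 / interval_total
    return total / n_intervals


# Two classes, two intervals: a clean split versus an uninformative one.
clean_split = caim([[10, 0], [0, 10]])
mixed_split = caim([[5, 5], [5, 5]])
```

A higher value means each interval is dominated by a single class, so the clean split scores well above the mixed one. Evaluating this criterion for many candidate cut points and attributes is repetitive, independent work, which is what makes it a natural fit for parallel evaluation.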
A Classification Module for Genetic Programming Algorithms in JCLEC
JCLEC-Classification is a usable and extensible open source library for genetic programming classification algorithms. It houses implementations of rule-based methods for classification based on genetic programming, supporting multiple model representations and providing users with the tools to implement any classifier easily. The software is written in Java and is available from http://jclec.sourceforge.net/classification under the GPL license.